AITopics | multiple choice learning

Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis

Neural Information Processing Systems, Apr-25-2026, 02:43:02 GMT

We introduce Resilient Multiple Choice Learning (rMCL), an extension of the MCL approach for conditional distribution estimation in regression settings where multiple targets may be sampled for each training input. Multiple Choice Learning is a simple framework to tackle multimodal density estimation, using the Winner-Takes-All (WTA) loss for a set of hypotheses. In regression settings, the existing MCL variants focus on merging the hypotheses, thereby eventually sacrificing the diversity of the predictions. In contrast, our method relies on a novel learned scoring scheme underpinned by a mathematical framework based on Voronoi tessellations of the output space, from which we can derive a probabilistic interpretation. After empirically validating rMCL with experiments on synthetic data, we further assess its merits on the sound source localization task, demonstrating its practical usefulness and the relevance of its interpretation.
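
The abstract above combines a Winner-Takes-All loss with a learned score per hypothesis. The following is a minimal sketch of that idea for a single input, not the authors' implementation. It assumes a squared-error distance, one score in (0, 1) per hypothesis, and a binary cross-entropy scoring term that marks a hypothesis as active when at least one target falls in its Voronoi cell. The function names and the weighting parameter `beta` are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def wta_with_scores(hypotheses, scores, targets, beta=1.0, eps=1e-12):
    """Winner-Takes-All loss plus a scoring loss for one input (illustrative).

    hypotheses: K predicted points.
    scores:     K predicted probabilities that a hypothesis is "active".
    targets:    one or more ground-truth points for this input.
    Returns (winners, wta_loss, score_loss, total_loss).
    """
    k = len(hypotheses)
    # Each target is assigned to its closest hypothesis, i.e. the hypothesis
    # whose Voronoi cell contains the target.
    winners = [
        min(range(k), key=lambda i: squared_distance(hypotheses[i], y))
        for y in targets
    ]
    # Only the winning hypothesis is penalized for each target.
    wta_loss = sum(squared_distance(hypotheses[w], y) for w, y in zip(winners, targets))
    # Binary cross-entropy: score should be high for hypotheses that won at
    # least one target, and low for the others (assumed scoring objective).
    active = set(winners)
    score_loss = 0.0
    for i, s in enumerate(scores):
        s = min(max(s, eps), 1.0 - eps)  # clamp to keep the logarithm finite
        score_loss -= math.log(s) if i in active else math.log(1.0 - s)
    return winners, wta_loss, score_loss, wta_loss + beta * score_loss


if __name__ == "__main__":
    hyps = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    print(wta_with_scores(hyps, [0.9, 0.8, 0.1], [(0.1, 0.0), (0.9, 1.0)]))
```

In this sketch the third hypothesis wins no target, so only its score is pushed toward zero while its position receives no update. This is one way to keep unused hypotheses from being merged into the others.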

artificial intelligence, hypothesis, machine learning, (16 more...)

Country: Europe (0.46)

Industry: Education (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

Stefan Lee, Senthil Purushwalkam Shiva Prakash, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra

Neural Information Processing Systems, Mar-23-2026, 03:37:22 GMT

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions. In these contexts, it is beneficial to provide these oracle mechanisms with multiple highly likely hypotheses rather than a single prediction. In this work, we pose the task of producing multiple outputs as a learning problem over an ensemble of deep networks, introducing a novel stochastic gradient descent based approach to minimize the loss with respect to an oracle. Our method is simple to implement, agnostic to both architecture and loss function, and parameter-free. Our approach achieves lower oracle error compared to existing methods on a wide range of tasks and deep architectures. We also show qualitatively that the diverse solutions produced often provide interpretable representations of task ambiguity.
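
The oracle-loss training described in this abstract can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's code: the "ensemble" is a list of one-dimensional linear models `y = w * x`, the loss is squared error, and for each example only the member with the lowest loss receives a gradient. The names `smcl_step` and `train` are hypothetical.

```python
def smcl_step(weights, batch, lr=0.1):
    """One SGD step that updates, per example, only the best ensemble member.

    weights: list of slopes, one per ensemble member (model is y = w * x).
    batch:   list of (x, y) pairs.
    Returns (new_weights, mean_oracle_loss_before_update).
    """
    grads = [0.0] * len(weights)
    oracle_loss = 0.0
    for x, y in batch:
        losses = [(w * x - y) ** 2 for w in weights]
        # The "oracle" picks the member with the lowest loss on this example.
        winner = min(range(len(weights)), key=losses.__getitem__)
        oracle_loss += losses[winner]
        # Gradient of the squared error with respect to the winner's slope.
        grads[winner] += 2.0 * (weights[winner] * x - y) * x
    n = len(batch)
    new_weights = [w - lr * g / n for w, g in zip(weights, grads)]
    return new_weights, oracle_loss / n


def train(weights, batch, steps=100, lr=0.1):
    """Repeat smcl_step and return the final weights and last oracle loss."""
    loss = float("inf")
    for _ in range(steps):
        weights, loss = smcl_step(weights, batch, lr)
    return weights, loss


if __name__ == "__main__":
    # Ambiguous data: every x has two valid answers, y = 2x and y = -2x.
    data = [(1.0, 2.0), (1.0, -2.0), (2.0, 4.0), (2.0, -4.0)]
    print(train([0.5, -0.5], data))
```

A single model trained with ordinary squared error on this data would settle near a slope of zero, the average of the two answers. With the winner-only update, each member can specialize on one of the two modes.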

artificial intelligence, ensemble, machine learning, (17 more...)

Country: North America > United States (0.69)

Industry:

Education (0.86)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning to Specialize with Knowledge Distillation for Visual Question Answering

Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han

Neural Information Processing Systems, Feb-12-2026, 06:00:46 GMT

accuracy, dataset, specialized model, (13 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

12d7ba753894ed348904df1bf0ce02ec-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 02:17:19 GMT

hypothesis, prediction, rmcl, (13 more...)

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Learning to Specialize with Knowledge Distillation for Visual Question Answering

Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han

Neural Information Processing Systems, Nov-20-2025, 14:23:58 GMT

The proposed framework is model-agnostic and applicable to any task other than VQA, e.g. ...

artificial intelligence, machine learning, specialized model, (15 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

beac6bfb7eac3d651307c16ac747df01-Paper-Conference.pdf

Neural Information Processing Systems, Aug-18-2025, 11:26:21 GMT

artificial intelligence, discriminator, machine learning, (15 more...)

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.05)
Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > California (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Multiple Choice Learning of Low Rank Adapters for Language Modeling

Victor Letzelter, Hugo Malard, Mathieu Fontaine, Gaël Richard, Slim Essid, Andrei Bursuc, Patrick Pérez

arXiv.org Machine Learning, Jul-15-2025

We propose LoRA-MCL, a training scheme that extends next-token prediction in language models with a method designed to decode diverse, plausible sentence continuations at inference time. Traditional language modeling is an intrinsically ill-posed problem: given a context, multiple futures may be equally plausible. Our approach leverages Multiple Choice Learning (MCL) and the Winner-Takes-All (WTA) loss to efficiently handle ambiguity through Low-Rank Adaptation (LoRA). We provide a theoretical interpretation of applying Multiple Choice Learning to Language Modeling, assuming the data is generated from a mixture of distributions. To illustrate the proposed approach, we use data sampled from mixtures of Markov chains. We then demonstrate with extensive experiments on real-world visual and audio captioning tasks that our method achieves high diversity and relevance in generated outputs.
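
The abstract mentions illustrating the approach on data sampled from mixtures of Markov chains. As a rough sketch of what such synthetic data could look like, the code below draws one sequence by first picking a chain according to mixture weights and then walking that chain's transition matrix. The uniform initial state, the function name, and the example matrices are assumptions made here for illustration and are not taken from the paper.

```python
import random


def sample_markov_mixture(transitions, mix_weights, length, rng):
    """Draw one sequence from a mixture of Markov chains (illustrative).

    transitions: list of row-stochastic matrices, one per mixture component.
    mix_weights: probability of selecting each component.
    length:      number of states in the returned sequence.
    rng:         a random.Random instance, for reproducibility.
    Returns (component_index, sequence_of_states).
    """
    component = rng.choices(range(len(transitions)), weights=mix_weights)[0]
    matrix = transitions[component]
    n_states = len(matrix)
    state = rng.randrange(n_states)  # uniform initial state (assumption)
    sequence = [state]
    for _ in range(length - 1):
        # The next state is drawn from the row of the current state.
        state = rng.choices(range(n_states), weights=matrix[state])[0]
        sequence.append(state)
    return component, sequence


if __name__ == "__main__":
    sticky = [[0.9, 0.1], [0.1, 0.9]]       # tends to repeat the last state
    alternating = [[0.1, 0.9], [0.9, 0.1]]  # tends to flip the last state
    rng = random.Random(0)
    for _ in range(3):
        print(sample_markov_mixture([sticky, alternating], [0.5, 0.5], 10, rng))
```

Given the same prefix, the two components favor different continuations. This is the kind of ambiguity that a single next-token predictor averages over and that a set of hypotheses could, in principle, separate.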

large language model, machine learning, natural language, (20 more...)

2507.10419

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Education (0.91)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multiple Choice Learning for Efficient Speech Separation with Many Speakers

David Perera, François Derrida, Théo Mariotte, Gaël Richard, Slim Essid

arXiv.org Machine Learning, Nov-27-2024

Training speech separation models in the supervised setting raises a permutation problem: finding the best assignment between the model predictions and the ground truth separated signals. This inherently ambiguous task is customarily solved using Permutation Invariant Training (PIT). In this article, we instead consider using the Multiple Choice Learning (MCL) framework, which was originally introduced to tackle ambiguous tasks. We demonstrate experimentally on the popular WSJ0-mix and LibriMix benchmarks that MCL matches the performance of PIT, while being computationally advantageous. This opens the door to a promising research direction, as MCL can be naturally extended to handle a variable number of speakers, or to tackle speech separation in the unsupervised setting.
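
To make the contrast between PIT and an MCL-style objective concrete, here is a small sketch on toy "signals" represented as short lists of numbers. It assumes mean squared error as the pairwise cost, a brute-force PIT that tries every permutation, and an MCL-style loss in which each target simply takes its best-matching prediction. These choices and all function names are illustrative assumptions and may differ from the losses used in the paper.

```python
from itertools import permutations


def pairwise_costs(predictions, targets):
    """costs[i][j] = mean squared error between prediction i and target j."""
    return [
        [sum((p - t) ** 2 for p, t in zip(pred, tgt)) / len(tgt) for tgt in targets]
        for pred in predictions
    ]


def pit_loss(costs):
    """Brute-force PIT: best one-to-one assignment over all permutations."""
    n = len(costs)
    best = min(
        sum(costs[perm[j]][j] for j in range(n)) for perm in permutations(range(n))
    )
    return best / n


def mcl_loss(costs):
    """MCL-style loss: each target is matched to its closest prediction."""
    n_targets = len(costs[0])
    return sum(min(row[j] for row in costs) for j in range(n_targets)) / n_targets


if __name__ == "__main__":
    targets = [[1.0, 1.0], [-1.0, -1.0]]
    predictions = [[-0.9, -1.1], [1.1, 0.9]]  # correct sources, swapped order
    costs = pairwise_costs(predictions, targets)
    print(pit_loss(costs), mcl_loss(costs))
```

The brute-force PIT above enumerates n! permutations, whereas the per-target minimum only scans the n-by-n cost matrix. Because the MCL-style loss drops the one-to-one constraint, it is never larger than the PIT loss computed from the same costs.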

mcl, separation, speech separation, (14 more...)

2411.18497

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.82)

Industry: Education (0.62)

Technology:

Information Technology > Artificial Intelligence > Speech (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing

David Perera, Victor Letzelter, Théo Mariotte, Adrien Cortés, Mickael Chen, Slim Essid, Gaël Richard

arXiv.org Machine Learning, Jul-22-2024

We introduce Annealed Multiple Choice Learning (aMCL) which combines simulated annealing with MCL. MCL is a learning framework handling ambiguous tasks by predicting a small set of plausible hypotheses. These hypotheses are trained using the Winner-takes-all (WTA) scheme, which promotes the diversity of the predictions. However, this scheme may converge toward an arbitrarily suboptimal local minimum, due to the greedy nature of WTA. We overcome this limitation using annealing, which enhances the exploration of the hypothesis space during training. We leverage insights from statistical physics and information theory to provide a detailed description of the model training trajectory. Additionally, we validate our algorithm by extensive experiments on synthetic datasets, on the standard UCI benchmark, and on speech separation.
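
One common way to relax a greedy winner-takes-all assignment is to replace the hard argmin with Boltzmann (softmin) weights at a temperature that is lowered during training. The sketch below shows that general mechanism only. The exponential schedule, the function names, and the exact form of the weights are assumptions for illustration and should not be read as the aMCL algorithm itself.

```python
import math


def boltzmann_weights(losses, temperature):
    """Soft assignment of one target to hypotheses, given per-hypothesis losses.

    High temperature gives nearly uniform weights (more exploration);
    temperature approaching zero recovers the hard winner-takes-all choice.
    """
    best = min(losses)
    if temperature <= 0.0:
        # Limit case: all weight on the first hypothesis with minimal loss.
        winner = losses.index(best)
        return [1.0 if i == winner else 0.0 for i in range(len(losses))]
    # Subtracting the minimum keeps the exponentials numerically stable.
    exps = [math.exp(-(loss - best) / temperature) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def exponential_schedule(initial_temperature, decay, steps):
    """Temperatures T_s = T_0 * decay**s for s = 0, ..., steps - 1."""
    return [initial_temperature * decay ** s for s in range(steps)]


if __name__ == "__main__":
    losses = [0.1, 0.5, 2.0]
    for temperature in exponential_schedule(10.0, 0.1, 4):
        weights = boltzmann_weights(losses, temperature)
        print(temperature, [round(w, 4) for w in weights])
```

In a training loop, these weights would scale each hypothesis's loss for a given target, so that early on every hypothesis receives some gradient and the assignment sharpens as the temperature decreases.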

amcl, experiment, hypothesis, (14 more...)

2407.1558

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.92)

Industry: Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Speech (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Multiple Choice Learning: Learning to Produce Multiple Structured Outputs

Neural Information Processing Systems, Mar-14-2024, 17:43:24 GMT

We address the problem of generating multiple hypotheses for structured prediction tasks that involve interaction with users or successive components in a cascaded architecture. Given a set of multiple hypotheses, such components/users typically have the ability to retrieve the best (or approximately the best) solution in this set. The standard approach for handling such a scenario is to first learn a single-output model and then produce M-Best Maximum a Posteriori (MAP) hypotheses from this model. In contrast, we learn to produce multiple outputs by formulating this task as a multiple-output structured-output prediction problem with a loss function that effectively captures the setup of the problem. We present a max-margin formulation that minimizes an upper bound on this loss function. Experimental results on image segmentation and protein side-chain prediction show that our method outperforms conventional approaches used for this type of scenario and leads to substantial improvements in prediction accuracy.
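
The loss that "captures the setup of the problem" can be written down as a set-level oracle loss: the set of M outputs is only charged for its best member. The expression below is a sketch in notation assumed here (inputs x_i, ground truths y_i, predictors f_m, task loss ℓ), not copied from the paper.

```latex
% Assumed notation: x_i is an input, y_i its ground truth,
% \hat{y}_i^{m} = f_m(x_i) is the m-th of M predicted outputs,
% and \ell is a task loss.
\mathcal{L}_{\text{oracle}}(f_1, \dots, f_M)
  = \sum_{i=1}^{n} \min_{m \in \{1, \dots, M\}} \ell\bigl(y_i, \hat{y}_i^{m}\bigr)
```

Because of the inner minimum, this objective is not convex in the predictors, which is consistent with the abstract's choice to minimize an upper bound on it rather than the loss itself.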

algorithm, prediction, predictor, (14 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Virginia (0.04)

Industry: Education (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
